{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# [17、ConvMixer模型原理及其PyTorch逐行实现](https://www.bilibili.com/video/BV1K34y1o74P?spm_id_from=333.788.player.switch&vd_source=cdd897fffb54b70b076681c3c4e4d45d)\n",
    "\n",
    "- [Patches Are All You Need?](https://arxiv.org/pdf/2201.09792)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "卷积神经网络一直是视觉领域的主导架构，但是2022年的最近，基于ViT的Transformer架构模型也表现得很好，ViT采用了图像块嵌入（把图像分割成一个个小patch作为单元）。那么ViT架构模型的表现来自 **Transformer** 还是 **图像Patch** 呢？ConvMixer 论文发现，ViT模型的强大更多源于其基于图像块（patch）的表示方式，而非 Transformer 架构本身。\n",
    "\n",
    "![image-20250619224855366](http://assets.hypervoid.top/img/2025/06/19/image-20250619224855366-2d03.png)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import torch.nn as nn\n",
    "\n",
    "\n",
    "def ConvMixer(h, depth, kernel_size=9, patch_size=7, n_classes=1000):\n",
    "    Seq, ActBn = nn.Sequential, lambda x: Seq(x, nn.GELU(), nn.BatchNorm2d(h))\n",
    "    Residual = type(\"Residual\", (Seq,), {\"forward\": lambda self, x: self[0](x) + x})\n",
    "    return Seq(\n",
    "        ActBn(nn.Conv2d(3, h, patch_size, stride=patch_size)),\n",
    "        *[\n",
    "            Seq(\n",
    "                Residual(ActBn(nn.Conv2d(h, h, kernel_size, groups=h, padding=\"same\"))),\n",
    "                ActBn(nn.Conv2d(h, h, 1)),\n",
    "            )\n",
    "            for i in range(depth)\n",
    "        ],\n",
    "        nn.AdaptiveAvgPool2d((1, 1)),\n",
    "        nn.Flatten(),\n",
    "        nn.Linear(h, n_classes)\n",
    "    )"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "dl",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
